Packages

library(arules)
## Loading required package: Matrix
## 
## Attaching package: 'arules'
## The following objects are masked from 'package:base':
## 
##     abbreviate, write
setwd("C:/Users/Cholian/OneDrive/GU/Semester1/ANLY-501-04/portfolio/Project draft/ARM and Networks")

Read transaction data

Inspecting raw data

Twitter_data <- read.transactions("cleaned_twitter_basket.csv",
                           rm.duplicates = FALSE, 
                           format = "basket",  ##if you use "single" also use cols=c(1,2)
                           sep=",",  ## csv file
                           cols=1 )## The dataset HAS row numbers
inspect(Twitter_data[1:2])
##     items               transactionID
## [1] {airdrop,                        
##      best,                           
##      binance,                        
##      binancesmartchain,              
##      bnb,                            
##      bsc,                            
##      congratulation,                 
##      cryptocurrency,                 
##      fishytankgame,                  
##      lediemxn,                       
##      ledongvpx,                      
##      metaverse,                      
##      nft,                            
##      pancakeswap,                    
##      playnumbrearn,                  
##      playtoearn,                     
##      project,                        
##      team,                           
##      thanhluannd}                   0
## [2] {airdrop,                        
##      binance,                        
##      binancesmartchain,              
##      bnb,                            
##      bsc,                            
##      busy,                           
##      cryptocurrency,                 
##      fishytankgame,                  
##      game,                           
##      greet,                          
##      metaverse,                      
##      nft,                            
##      pancakeswap,                    
##      playnumbrearn,                  
##      playtoearn,                     
##      project}                       1
# rows of data
length(Twitter_data)
## [1] 4993

Findings and explanations

The transaction data are built from the text of tweets, one transaction per tweet; the raw data contain 4993 tweets. In the sample above, for example, stop words have been filtered out of the first tweet, leaving words such as “airdrop”, “best”, and “binance”.
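
As a rough illustration of the basket layout that `read.transactions(..., format = "basket", sep = ",", cols = 1)` reads, the sketch below turns token lists into one comma-separated line per tweet with a leading row id. The `tweets` list and the 0-based ids are assumptions made for illustration (the ids mirror the `transactionID` values 0 and 1 shown above); this is not the cleaning code actually used for `cleaned_twitter_basket.csv`.

```r
# Hypothetical tokenised tweets (illustrative only, not the project data)
tweets <- list(
  c("airdrop", "best", "binance", "airdrop"),
  c("airdrop", "bnb", "bsc")
)

# One line per tweet: a 0-based row id followed by the unique tokens
basket_lines <- vapply(
  seq_along(tweets),
  function(i) paste(c(i - 1, unique(tweets[[i]])), collapse = ","),
  character(1)
)

basket_lines
```

Repeated tokens are dropped with `unique()` in this sketch because a transaction is a set of items, so a word counts once per tweet.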

Apply the apriori algorithm

##### Use apriori to get the RULES
rules_twitter_all = arules::apriori(Twitter_data, parameter = list(support=.35, 
                                                 confidence=.5, minlen=2))
## Apriori
## 
## Parameter specification:
##  confidence minval smax arem  aval originalSupport maxtime support minlen
##         0.5    0.1    1 none FALSE            TRUE       5    0.35      2
##  maxlen target  ext
##      10  rules TRUE
## 
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
## 
## Absolute minimum support count: 1747 
## 
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[8378 item(s), 4993 transaction(s)] done [0.04s].
## sorting and recoding items ... [14 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 4 5 6 7 8 9 10 done [0.00s].
## writing ... [24418 rule(s)] done [0.01s].
## creating S4 object  ... done [0.02s].
# Removing inverted (reverse/duplicate) rules
gi <- generatingItemsets(rules_twitter_all)
# Keep the first rule per generating itemset; logical indexing also works when nothing is duplicated
rules_twitter <- rules_twitter_all[!duplicated(gi)]

inspect(rules_twitter[1:5])
##     lhs          rhs                 support   confidence coverage  lift    
## [1] {airdrop} => {binance}           0.4346085 1          0.4346085 2.300922
## [2] {airdrop} => {binancesmartchain} 0.4346085 1          0.4346085 2.300922
## [3] {airdrop} => {bsc}               0.4346085 1          0.4346085 2.300922
## [4] {airdrop} => {playtoearn}        0.4346085 1          0.4346085 2.300922
## [5] {airdrop} => {playnumbrearn}     0.4346085 1          0.4346085 2.300922
##     count
## [1] 2170 
## [2] 2170 
## [3] 2170 
## [4] 2170 
## [5] 2170
(summary(rules_twitter))
## set of 4074 rules
## 
## rule length distribution (lhs + rhs):sizes
##   2   3   4   5   6   7   8   9  10 
##  69 221 495 792 924 792 495 220  66 
## 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   2.000   5.000   6.000   5.996   7.000  10.000 
## 
## summary of quality measures:
##     support         confidence        coverage           lift      
##  Min.   :0.4346   Min.   :0.9993   Min.   :0.4346   Min.   :1.000  
##  1st Qu.:0.4346   1st Qu.:1.0000   1st Qu.:0.4346   1st Qu.:1.000  
##  Median :0.4346   Median :1.0000   Median :0.4346   Median :1.000  
##  Mean   :0.4347   Mean   :1.0000   Mean   :0.4347   Mean   :1.642  
##  3rd Qu.:0.4346   3rd Qu.:1.0000   3rd Qu.:0.4346   3rd Qu.:2.270  
##  Max.   :0.5738   Max.   :1.0000   Max.   :0.5738   Max.   :2.301  
##      count     
##  Min.   :2170  
##  1st Qu.:2170  
##  Median :2170  
##  Mean   :2171  
##  3rd Qu.:2170  
##  Max.   :2865  
## 
## mining info:
##          data ntransactions support confidence
##  Twitter_data          4993    0.35        0.5

Thresholds

The thresholds are a minimum support of 0.35, a minimum confidence of 0.5, and a minimum rule length of 2 items.

Findings and explanations

After applying the apriori algorithm and removing inverted (reverse/duplicate) rules, 4074 of the original 24418 rules remain. Most rules contain between 4 and 8 keywords. Support ranges from 0.4346 to 0.5738, so every keyword in these rules is a high-frequency word. The highest lift is 2.301, which indicates a strong association between some keywords; this is reviewed again in a later part.
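
The idea behind removing inverted rules can be sketched in base R without arules: two rules are treated as duplicates when the union of their left-hand and right-hand sides (the generating itemset) is the same. The `rules` data frame below is hypothetical and only illustrates the principle.

```r
# Hypothetical rules, written as LHS and RHS item strings
rules <- data.frame(
  lhs = c("airdrop", "binance", "airdrop", "bsc"),
  rhs = c("binance", "airdrop", "bsc", "airdrop"),
  stringsAsFactors = FALSE
)

# The generating itemset is the union of LHS and RHS, with item order ignored
generating <- mapply(
  function(l, r) paste(sort(c(l, r)), collapse = ","),
  rules$lhs, rules$rhs,
  USE.NAMES = FALSE
)

# Keep the first rule for each generating itemset
deduped <- rules[!duplicated(generating), ]
deduped
```

Logical indexing with `!duplicated(...)` is used here because negative indexing with an empty index vector (for example `x[-which(FALSE)]`) returns an empty result rather than the full object.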

Plot of frequent items

## Plot of which items are most frequent
arules::itemFrequencyPlot(Twitter_data, topN=20, type="absolute")

Findings and explanations

The most frequent words are cryptocurrency, numbr, https, bnb, airdrop, binance and bsc, which closely matches the earlier EDA part.
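
For readers without arules at hand, absolute item frequencies of the kind plotted above can be approximated in base R by counting how many transactions contain each word. The `transactions` list is a made-up example, not the project data.

```r
# Hypothetical transactions (one character vector of words per tweet)
transactions <- list(
  c("cryptocurrency", "numbr", "https"),
  c("cryptocurrency", "airdrop", "binance"),
  c("cryptocurrency", "numbr")
)

# Count each word once per transaction, then sort from most to least frequent
item_counts <- sort(
  table(unlist(lapply(transactions, unique))),
  decreasing = TRUE
)

head(item_counts, 2)
```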

Sorted rules

Sorted by confidence (top15)

## Sort rules by a measure such as conf, sup, or lift
sort_rule_twitter_con <- sort(rules_twitter, by="confidence", decreasing=TRUE)
inspect(sort_rule_twitter_con[1:15])
##      lhs          rhs                 support   confidence coverage  lift    
## [1]  {airdrop} => {binance}           0.4346085 1          0.4346085 2.300922
## [2]  {airdrop} => {binancesmartchain} 0.4346085 1          0.4346085 2.300922
## [3]  {airdrop} => {bsc}               0.4346085 1          0.4346085 2.300922
## [4]  {airdrop} => {playtoearn}        0.4346085 1          0.4346085 2.300922
## [5]  {airdrop} => {playnumbrearn}     0.4346085 1          0.4346085 2.300922
## [6]  {airdrop} => {fishytankgame}     0.4346085 1          0.4346085 2.300922
## [7]  {airdrop} => {pancakeswap}       0.4346085 1          0.4346085 2.300922
## [8]  {airdrop} => {nft}               0.4346085 1          0.4346085 2.300922
## [9]  {airdrop} => {metaverse}         0.4346085 1          0.4346085 2.300922
## [10] {airdrop} => {bnb}               0.4346085 1          0.4346085 2.269545
## [11] {airdrop} => {cryptocurrency}    0.4346085 1          0.4346085 1.000000
## [12] {binance} => {binancesmartchain} 0.4346085 1          0.4346085 2.300922
## [13] {binance} => {bsc}               0.4346085 1          0.4346085 2.300922
## [14] {binance} => {playtoearn}        0.4346085 1          0.4346085 2.300922
## [15] {binance} => {playnumbrearn}     0.4346085 1          0.4346085 2.300922
##      count
## [1]  2170 
## [2]  2170 
## [3]  2170 
## [4]  2170 
## [5]  2170 
## [6]  2170 
## [7]  2170 
## [8]  2170 
## [9]  2170 
## [10] 2170 
## [11] 2170 
## [12] 2170 
## [13] 2170 
## [14] 2170 
## [15] 2170

Findings and explanations

Around airdrop, the rules show the following:

  1. Airdrop activity is associated with Binance, one of the best-known cryptocurrency exchanges.

  2. BSC (Binance Smart Chain) is a chain supported by Binance, and it is one of the most important tools that NFT activity depends on. In fact, most of the NFT activity in these tweets relies on BSC.

  3. Airdrop activities still focus mainly on attracting the public, which is why “playtoearn”, “playnumbrearn” and similar words rank near the top.

Sorted by support (top15)

sort_rule_twitter_sup <- sort(rules_twitter, by="support", decreasing=TRUE)
inspect(sort_rule_twitter_sup[1:15])
##      lhs              rhs                 support   confidence coverage 
## [1]  {numbr}       => {cryptocurrency}    0.5738033 1.0000000  0.5738033
## [2]  {https}       => {cryptocurrency}    0.5659924 1.0000000  0.5659924
## [3]  {https}       => {numbr}             0.5655918 0.9992923  0.5659924
## [4]  {https,numbr} => {cryptocurrency}    0.5655918 1.0000000  0.5655918
## [5]  {bnb}         => {cryptocurrency}    0.4406169 1.0000000  0.4406169
## [6]  {airdrop}     => {binance}           0.4346085 1.0000000  0.4346085
## [7]  {airdrop}     => {binancesmartchain} 0.4346085 1.0000000  0.4346085
## [8]  {airdrop}     => {bsc}               0.4346085 1.0000000  0.4346085
## [9]  {airdrop}     => {playtoearn}        0.4346085 1.0000000  0.4346085
## [10] {airdrop}     => {playnumbrearn}     0.4346085 1.0000000  0.4346085
## [11] {airdrop}     => {fishytankgame}     0.4346085 1.0000000  0.4346085
## [12] {airdrop}     => {pancakeswap}       0.4346085 1.0000000  0.4346085
## [13] {airdrop}     => {nft}               0.4346085 1.0000000  0.4346085
## [14] {airdrop}     => {metaverse}         0.4346085 1.0000000  0.4346085
## [15] {airdrop}     => {bnb}               0.4346085 1.0000000  0.4346085
##      lift     count
## [1]  1.000000 2865 
## [2]  1.000000 2826 
## [3]  1.741524 2824 
## [4]  1.000000 2824 
## [5]  1.000000 2200 
## [6]  2.300922 2170 
## [7]  2.300922 2170 
## [8]  2.300922 2170 
## [9]  2.300922 2170 
## [10] 2.300922 2170 
## [11] 2.300922 2170 
## [12] 2.300922 2170 
## [13] 2.300922 2170 
## [14] 2.300922 2170 
## [15] 2.269545 2170

Findings and explanations

  1. “numbr” stands for “number”. However, it is still unclear why “numbr” appears in place of “number”.
  2. “bnb” is a token supported by Binance whose price closely follows that of mainstream tokens such as ETH and BTC.
  3. As noted above, airdrop is mostly associated with Binance.

Sorted by lift (top15)

sort_rule_twitter_lift <- sort(rules_twitter, by="lift", decreasing=TRUE)
inspect(sort_rule_twitter_lift[1:15])
##      lhs          rhs                 support   confidence coverage  lift    
## [1]  {airdrop} => {binance}           0.4346085 1          0.4346085 2.300922
## [2]  {airdrop} => {binancesmartchain} 0.4346085 1          0.4346085 2.300922
## [3]  {airdrop} => {bsc}               0.4346085 1          0.4346085 2.300922
## [4]  {airdrop} => {playtoearn}        0.4346085 1          0.4346085 2.300922
## [5]  {airdrop} => {playnumbrearn}     0.4346085 1          0.4346085 2.300922
## [6]  {airdrop} => {fishytankgame}     0.4346085 1          0.4346085 2.300922
## [7]  {airdrop} => {pancakeswap}       0.4346085 1          0.4346085 2.300922
## [8]  {airdrop} => {nft}               0.4346085 1          0.4346085 2.300922
## [9]  {airdrop} => {metaverse}         0.4346085 1          0.4346085 2.300922
## [10] {binance} => {binancesmartchain} 0.4346085 1          0.4346085 2.300922
## [11] {binance} => {bsc}               0.4346085 1          0.4346085 2.300922
## [12] {binance} => {playtoearn}        0.4346085 1          0.4346085 2.300922
## [13] {binance} => {playnumbrearn}     0.4346085 1          0.4346085 2.300922
## [14] {binance} => {fishytankgame}     0.4346085 1          0.4346085 2.300922
## [15] {binance} => {pancakeswap}       0.4346085 1          0.4346085 2.300922
##      count
## [1]  2170 
## [2]  2170 
## [3]  2170 
## [4]  2170 
## [5]  2170 
## [6]  2170 
## [7]  2170 
## [8]  2170 
## [9]  2170 
## [10] 2170 
## [11] 2170 
## [12] 2170 
## [13] 2170 
## [14] 2170 
## [15] 2170

Findings and explanations

Sorting by lift brings out some positive associations. For example, airdrop, binance and bsc are strongly associated with each other, and the main theme of airdrop-related activity is attracting people to increase participation.
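
The lift values in the tables above can be checked by hand from the reported counts. The sketch below recomputes the rule {airdrop} => {bnb}: the counts 2170 (airdrop, and airdrop together with bnb) and 2200 (bnb) are read off the `count` and `coverage` columns of the output, and 4993 is `length(Twitter_data)`. The variable names are introduced here for illustration.

```r
n_tweets  <- 4993  # number of transactions
n_airdrop <- 2170  # tweets containing airdrop (coverage 0.4346085)
n_bnb     <- 2200  # tweets containing bnb (coverage 0.4406169)
n_both    <- 2170  # tweets containing both airdrop and bnb

support    <- n_both / n_tweets                 # share of tweets with both items
confidence <- n_both / n_airdrop                # share of airdrop tweets that also contain bnb
lift       <- confidence / (n_bnb / n_tweets)   # confidence relative to the base rate of bnb

round(c(support = support, confidence = confidence, lift = lift), 6)
```

The lift works out to 4993 / 2200, about 2.2695, in line with the 2.269545 reported for this rule. A lift of 1, as for {airdrop} => {cryptocurrency}, means the right-hand side is no more likely given the left-hand side than it is overall.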

Summary and discussion of ARM in the data

Discussion on ARM and apriori algorithm

ARM stands for “association rule mining”. It aims to find association rules in a given dataset that satisfy predefined minimum support and confidence thresholds.
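
To make the definitions concrete, here is a minimal base R sketch that computes support, confidence and lift for a single rule on a toy transaction list and then applies the 0.35 and 0.5 thresholds. The data and the helper names `support` and `rule_metrics` are assumptions for illustration; the apriori implementation in arules searches itemsets far more efficiently than this.

```r
# Hypothetical transactions
transactions <- list(
  c("airdrop", "binance", "nft"),
  c("airdrop", "binance"),
  c("nft", "ethereum"),
  c("binance", "nft")
)

# Fraction of transactions that contain every item in `items`
support <- function(items, trans) {
  mean(vapply(trans, function(tr) all(items %in% tr), logical(1)))
}

# Support, confidence and lift of the rule lhs => rhs
rule_metrics <- function(lhs, rhs, trans) {
  supp <- support(c(lhs, rhs), trans)
  conf <- supp / support(lhs, trans)
  c(support = supp, confidence = conf, lift = conf / support(rhs, trans))
}

m <- rule_metrics("airdrop", "binance", transactions)

# A rule is kept only if it clears both minimum thresholds
keep <- m[["support"]] >= 0.35 && m[["confidence"]] >= 0.5
```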

In this project, the dataset is composed of tweets related to keywords such as “NFT”, “Cryptocurrency”, “Ethereum”, “BTC”, and “Defi”, from which 4074 rules remain after deduplication. After removing stop words and meaningless words (for example, words run together into strings of about 15 letters), the remaining words and rules are all keywords that people are currently discussing.

When applying the apriori algorithm, the thresholds are set to 0.35 support and 0.5 confidence. The 35% support threshold filters out low-frequency rules, those whose items appear together in fewer than 35% of the tweets. The 50% confidence threshold helps filter out weakly associated rules, those whose right-hand side appears in fewer than half of the tweets that contain the left-hand side.

Conclusion on findings

In summary, the results above support the following findings:

  1. The mainstream topic (in September) in cryptocurrency tweets is related to “NFT”, especially “airdrop” activities.

  2. Binance and BSC are essential parts of the NFT discussion.

  3. Cryptocurrencies are still mostly on the web rather than real content.

  4. The crypto world is still at the “common sense” stage, where the whole market still needs to attract people to participate.

Visualization for professionals

tcltk

library(tcltk)
library(arulesViz)

Support

subrules <- head(sort(rules_twitter, by="support"),50)

# interactive plot
plot(subrules, method="graph", engine="htmlwidget")
plot(subrules, method="graph", engine = "ggplot2")

Confidence

subrules <- head(sort(rules_twitter, by="confidence"),50)

plot(subrules, method="graph", engine="htmlwidget")
plot(subrules, method="graph", engine = "ggplot2")

Lift

subrules <- head(sort(rules_twitter, by="lift"),50)

plot(subrules, method="graph", engine="htmlwidget") #%>%
  #visNodes(scaling = list(label = list(enabled = FALSE)))
plot(subrules, method="graph", engine = "ggplot2")

Specific rules

By RHS (ethereum by support)

## Selecting or targeting specific rules  RHS
ethRules <- apriori(data=Twitter_data,parameter = list(supp=.01, conf=.01, minlen=2),
                     appearance = list(default="lhs", rhs="ethereum"),
                     control=list(verbose=FALSE))
ethRules <- sort(ethRules, decreasing=TRUE, by="support")
inspect(ethRules[1:15])
##      lhs                               rhs        support   confidence
## [1]  {coinhuntworld}                => {ethereum} 0.1393952 0.4957265 
## [2]  {find}                         => {ethereum} 0.1393952 0.4946695 
## [3]  {awesome}                      => {ethereum} 0.1393952 0.4873950 
## [4]  {play}                         => {ethereum} 0.1393952 0.4846797 
## [5]  {join}                         => {ethereum} 0.1393952 0.4464400 
## [6]  {https}                        => {ethereum} 0.1393952 0.2462845 
## [7]  {numbr}                        => {ethereum} 0.1393952 0.2429319 
## [8]  {cryptocurrency}               => {ethereum} 0.1393952 0.1393952 
## [9]  {coinhuntworld,find}           => {ethereum} 0.1393952 0.4957265 
## [10] {awesome,coinhuntworld}        => {ethereum} 0.1393952 0.4960798 
## [11] {coinhuntworld,play}           => {ethereum} 0.1393952 0.4957265 
## [12] {coinhuntworld,join}           => {ethereum} 0.1393952 0.4957265 
## [13] {coinhuntworld,https}          => {ethereum} 0.1393952 0.4957265 
## [14] {coinhuntworld,numbr}          => {ethereum} 0.1393952 0.4957265 
## [15] {coinhuntworld,cryptocurrency} => {ethereum} 0.1393952 0.4957265 
##      coverage  lift     count
## [1]  0.2811937 3.556268 696  
## [2]  0.2817945 3.548685 696  
## [3]  0.2860004 3.496499 696  
## [4]  0.2876026 3.477019 696  
## [5]  0.3122371 3.202694 696  
## [6]  0.5659924 1.766808 696  
## [7]  0.5738033 1.742757 696  
## [8]  1.0000000 1.000000 696  
## [9]  0.2811937 3.556268 696  
## [10] 0.2809934 3.558803 696  
## [11] 0.2811937 3.556268 696  
## [12] 0.2811937 3.556268 696  
## [13] 0.2811937 3.556268 696  
## [14] 0.2811937 3.556268 696  
## [15] 0.2811937 3.556268 696

By LHS (nft by lift)

## Selecting or targeting specific rules  LHS
nftRules <- apriori(data=Twitter_data,parameter = list(supp=.001, conf=.01, minlen=2),
                     appearance = list(default="rhs", lhs="nft"),
                     control=list(verbose=FALSE))
nftRules <- sort(nftRules, decreasing=TRUE, by="lift")
inspect(nftRules[1:15])
##      lhs      rhs                support     confidence coverage  lift    
## [1]  {nft} => {definitely}       0.015021029 0.03456221 0.4346085 2.300922
## [2]  {nft} => {earn}             0.004406169 0.01013825 0.4346085 2.300922
## [3]  {nft} => {give}             0.004406169 0.01013825 0.4346085 2.300922
## [4]  {nft} => {soon}             0.004406169 0.01013825 0.4346085 2.300922
## [5]  {nft} => {nguynhoivngnumbr} 0.005007010 0.01152074 0.4346085 2.300922
## [6]  {nft} => {lucky}            0.004806729 0.01105991 0.4346085 2.300922
## [7]  {nft} => {lminhhnnumbr}     0.005207290 0.01198157 0.4346085 2.300922
## [8]  {nft} => {levnduynumbr}     0.005207290 0.01198157 0.4346085 2.300922
## [9]  {nft} => {come}             0.005007010 0.01152074 0.4346085 2.300922
## [10] {nft} => {involve}          0.005207290 0.01198157 0.4346085 2.300922
## [11] {nft} => {share}            0.005007010 0.01152074 0.4346085 2.300922
## [12] {nft} => {well}             0.005207290 0.01198157 0.4346085 2.300922
## [13] {nft} => {hopefully}        0.005207290 0.01198157 0.4346085 2.300922
## [14] {nft} => {phanhuynumbr}     0.005808131 0.01336406 0.4346085 2.300922
## [15] {nft} => {lynhakynumbr}     0.005808131 0.01336406 0.4346085 2.300922
##      count
## [1]  75   
## [2]  22   
## [3]  22   
## [4]  22   
## [5]  25   
## [6]  24   
## [7]  26   
## [8]  26   
## [9]  25   
## [10] 26   
## [11] 25   
## [12] 26   
## [13] 26   
## [14] 29   
## [15] 29

networkD3

Support

TweetTrans_rules<-sort_rule_twitter_sup[1:100]

## Convert the RULES to a DATAFRAME
Rules_DF2<-DATAFRAME(TweetTrans_rules, separate = TRUE)

## Convert to char and remove all {} (LHS/RHS only, so the quality measures stay numeric)
Rules_DF2$LHS<-gsub("[{}]", "", as.character(Rules_DF2$LHS))
Rules_DF2$RHS<-gsub("[{}]", "", as.character(Rules_DF2$RHS))

## USING SUP
Rules_S<-Rules_DF2[c(1,2,3)]
names(Rules_S) <- c("SourceName", "TargetName", "Weight")

### Support
## Choose and set
Rules_Sup<-Rules_S

##################### BUILD THE NODES & EDGES #######################
edgeList<-Rules_Sup

MyGraph <- igraph::simplify(igraph::graph.data.frame(edgeList, directed=TRUE))

nodeList <- data.frame(ID = c(0:(igraph::vcount(MyGraph) - 1)), 
                       # because networkD3 library requires IDs to start at 0
                       nName = igraph::V(MyGraph)$name)

## Node Degree
nodeList <- cbind(nodeList, nodeDegree=igraph::degree(MyGraph, 
                    v = igraph::V(MyGraph), mode = "all"))

## Betweenness
BetweenNess <- igraph::betweenness(MyGraph, 
      v = igraph::V(MyGraph), 
      directed = TRUE) 

nodeList <- cbind(nodeList, nodeBetweenness=BetweenNess)

## This can change the BetweenNess value if needed

## Min-Max Normalization
##BetweenNess.norm <- (BetweenNess - min(BetweenNess))/(max(BetweenNess) - min(BetweenNess))

#################### BUILD THE EDGES #########################
getNodeID <- function(x){
  which(x == igraph::V(MyGraph)$name) - 1  #IDs start at 0
}

edgeList <- plyr::ddply(
  Rules_Sup, .variables = c("SourceName", "TargetName" , "Weight"), 
  function (x) data.frame(SourceID = getNodeID(x$SourceName), 
                          TargetID = getNodeID(x$TargetName)))
##############  Dice Sim #################
DiceSim <- igraph::similarity.dice(MyGraph, vids = igraph::V(MyGraph), mode = "all")

#Create  data frame that contains the Dice similarity between any two vertices
F1 <- function(x) {data.frame(diceSim = DiceSim[x$SourceID +1, x$TargetID + 1])}
#Place a new column in edgeList with the Dice Sim
edgeList <- plyr::ddply(edgeList,
                        .variables=c("SourceName", "TargetName", "Weight", 
                                               "SourceID", "TargetID"), 
                        function(x) data.frame(F1(x)))
##################   color #####################
COLOR_P <- colorRampPalette(c("#00FF00", "#FF0000"), 
                            bias = nrow(edgeList), space = "rgb", 
                            interpolate = "linear")
colCodes <- COLOR_P(length(unique(edgeList$diceSim)))
edges_col <- sapply(edgeList$diceSim, 
                    function(x) colCodes[which(sort(unique(edgeList$diceSim)) == x)])

## NetworkD3 Object
D3_network_Tweets <- networkD3::forceNetwork(
  Links = edgeList, # data frame that contains info about edges
  Nodes = nodeList, # data frame that contains info about nodes
  Source = "SourceID", # ID of source node 
  Target = "TargetID", # ID of target node
  Value = "Weight", # value from the edge list (data frame) that will be used to value/weight relationship amongst nodes
  NodeID = "nName", # value from the node list (data frame) that contains node description we want to use (e.g., node name)
  Nodesize = "nodeBetweenness",  # value from the node list (data frame) that contains value we want to use for a node size
  Group = "nodeDegree",  # value from the node list (data frame) that contains value we want to use for node color
  height = 700, # Size of the plot (vertical)
  width = 900,  # Size of the plot (horizontal)
  fontSize = 8, # Font size
  linkDistance = networkD3::JS("function(d) { return d.value*10; }"), # Function to determine distance between any two nodes, uses variables already defined in forceNetwork function (not variables from a data frame)
  linkWidth = networkD3::JS("function(d) { return d.value/10; }"),# Function to determine link/edge thickness, uses variables already defined in forceNetwork function (not variables from a data frame)
  opacity = 0.9, # opacity
  zoom = TRUE, # ability to zoom when click on the node
  opacityNoHover = 0.9, # opacity of labels when static
  linkColour = "red"   # edge colour (edges_col is computed above but not used here)
) 
# Plot network
D3_network_Tweets
# Save network as html file
networkD3::saveNetwork(D3_network_Tweets, 
                       "NetD3_Twitter_support.html", selfcontained = TRUE)

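The zero-based node ids that networkD3 expects can also be produced with a vectorised lookup. The sketch below is a base R alternative to looking up each name individually; the `edges` data frame and its rounded weights are hypothetical, and this is not a drop-in replacement for the full pipeline above (it skips the Dice similarity step).

```r
# Hypothetical edge list with source/target names and a weight
edges <- data.frame(
  SourceName = c("airdrop", "airdrop", "https,numbr"),
  TargetName = c("binance", "bsc", "cryptocurrency"),
  Weight     = c(0.43, 0.43, 0.57),
  stringsAsFactors = FALSE
)

# Node table: one row per distinct name, ids starting at 0
node_names <- unique(c(edges$SourceName, edges$TargetName))
nodes <- data.frame(
  ID    = seq_along(node_names) - 1,
  nName = node_names,
  stringsAsFactors = FALSE
)

# match() returns 1-based positions, so subtract 1 for zero-based ids
edges$SourceID <- match(edges$SourceName, nodes$nName) - 1
edges$TargetID <- match(edges$TargetName, nodes$nName) - 1
```

A multi-item left-hand side such as "https,numbr" is treated as a single node here, as in the code above.
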
Confidence

TweetTrans_rules<-sort_rule_twitter_con[1:100]

## Convert the RULES to a DATAFRAME
Rules_DF2<-DATAFRAME(TweetTrans_rules, separate = TRUE)

## Convert to char and remove all {} (LHS/RHS only, so the quality measures stay numeric)
Rules_DF2$LHS<-gsub("[{}]", "", as.character(Rules_DF2$LHS))
Rules_DF2$RHS<-gsub("[{}]", "", as.character(Rules_DF2$RHS))

## USING CONF
Rules_C<-Rules_DF2[c(1,2,4)]
names(Rules_C) <- c("SourceName", "TargetName", "Weight")

### Confidence
## Choose and set
Rules_Sup<-Rules_C

##################### BUILD THE NODES & EDGES #######################
edgeList<-Rules_Sup

MyGraph <- igraph::simplify(igraph::graph.data.frame(edgeList, directed=TRUE))

nodeList <- data.frame(ID = c(0:(igraph::vcount(MyGraph) - 1)), 
                       # because networkD3 library requires IDs to start at 0
                       nName = igraph::V(MyGraph)$name)

## Node Degree
nodeList <- cbind(nodeList, nodeDegree=igraph::degree(MyGraph, 
                    v = igraph::V(MyGraph), mode = "all"))

## Betweenness
BetweenNess <- igraph::betweenness(MyGraph, 
      v = igraph::V(MyGraph), 
      directed = TRUE) 

nodeList <- cbind(nodeList, nodeBetweenness=BetweenNess)

## This can change the BetweenNess value if needed

## Min-Max Normalization
##BetweenNess.norm <- (BetweenNess - min(BetweenNess))/(max(BetweenNess) - min(BetweenNess))

#################### BUILD THE EDGES #########################
getNodeID <- function(x){
  which(x == igraph::V(MyGraph)$name) - 1  #IDs start at 0
}

edgeList <- plyr::ddply(
  Rules_Sup, .variables = c("SourceName", "TargetName" , "Weight"), 
  function (x) data.frame(SourceID = getNodeID(x$SourceName), 
                          TargetID = getNodeID(x$TargetName)))
##############  Dice Sim #################
DiceSim <- igraph::similarity.dice(MyGraph, vids = igraph::V(MyGraph), mode = "all")

#Create  data frame that contains the Dice similarity between any two vertices
F1 <- function(x) {data.frame(diceSim = DiceSim[x$SourceID +1, x$TargetID + 1])}
#Place a new column in edgeList with the Dice Sim
edgeList <- plyr::ddply(edgeList,
                        .variables=c("SourceName", "TargetName", "Weight", 
                                               "SourceID", "TargetID"), 
                        function(x) data.frame(F1(x)))
##################   color #####################
COLOR_P <- colorRampPalette(c("#00FF00", "#FF0000"), 
                            bias = nrow(edgeList), space = "rgb", 
                            interpolate = "linear")
colCodes <- COLOR_P(length(unique(edgeList$diceSim)))
edges_col <- sapply(edgeList$diceSim, 
                    function(x) colCodes[which(sort(unique(edgeList$diceSim)) == x)])

## NetworkD3 Object
D3_network_Tweets <- networkD3::forceNetwork(
  Links = edgeList, # data frame that contains info about edges
  Nodes = nodeList, # data frame that contains info about nodes
  Source = "SourceID", # ID of source node 
  Target = "TargetID", # ID of target node
  Value = "Weight", # value from the edge list (data frame) that will be used to value/weight relationship amongst nodes
  NodeID = "nName", # value from the node list (data frame) that contains node description we want to use (e.g., node name)
  Nodesize = "nodeBetweenness",  # value from the node list (data frame) that contains value we want to use for a node size
  Group = "nodeDegree",  # value from the node list (data frame) that contains value we want to use for node color
  height = 700, # Size of the plot (vertical)
  width = 900,  # Size of the plot (horizontal)
  fontSize = 8, # Font size
  linkDistance = networkD3::JS("function(d) { return d.value*10; }"), # Function to determine distance between any two nodes, uses variables already defined in forceNetwork function (not variables from a data frame)
  linkWidth = networkD3::JS("function(d) { return d.value/10; }"),# Function to determine link/edge thickness, uses variables already defined in forceNetwork function (not variables from a data frame)
  opacity = 0.9, # opacity
  zoom = TRUE, # ability to zoom when click on the node
  opacityNoHover = 0.9, # opacity of labels when static
  linkColour = "red"   # edge colour (edges_col is computed above but not used here)
) 
# Plot network
D3_network_Tweets
# Save network as html file
networkD3::saveNetwork(D3_network_Tweets, 
                       "NetD3_Twitter_confidence.html", selfcontained = TRUE)

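The Dice similarity used for the edge colours is, for two vertices, twice the number of shared neighbours divided by the sum of their neighbour counts. The base R sketch below illustrates that definition on a small hypothetical edge list, treating edges as undirected (comparable to `mode = "all"`) and assuming no self-loops or duplicate edges; the helper names `neighbours` and `dice` are introduced here and are not igraph functions.

```r
# Hypothetical edge list
edges <- data.frame(
  from = c("airdrop", "airdrop", "nft", "nft"),
  to   = c("binance", "bsc", "binance", "bsc"),
  stringsAsFactors = FALSE
)

# Neighbours of vertex v, ignoring edge direction
neighbours <- function(v, e) {
  unique(c(e$to[e$from == v], e$from[e$to == v]))
}

# Dice similarity: 2 * |shared neighbours| / (|N(a)| + |N(b)|)
dice <- function(a, b, e) {
  na <- neighbours(a, e)
  nb <- neighbours(b, e)
  2 * length(intersect(na, nb)) / (length(na) + length(nb))
}

dice("airdrop", "nft", edges)      # both link to binance and bsc
dice("airdrop", "binance", edges)  # no shared neighbours
```
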
Lift

TweetTrans_rules<-sort_rule_twitter_lift[1:100]

## Convert the RULES to a DATAFRAME
Rules_DF2<-DATAFRAME(TweetTrans_rules, separate = TRUE)

## Convert to char and remove all {} (LHS/RHS only, so the quality measures stay numeric)
Rules_DF2$LHS<-gsub("[{}]", "", as.character(Rules_DF2$LHS))
Rules_DF2$RHS<-gsub("[{}]", "", as.character(Rules_DF2$RHS))

## USING LIFT (selected by name; column 5 is coverage in this output)
Rules_L<-Rules_DF2[c("LHS","RHS","lift")]
names(Rules_L) <- c("SourceName", "TargetName", "Weight")

### Lift
## Choose and set
Rules_Sup<-Rules_L

##################### BUILD THE NODES & EDGES #######################
edgeList<-Rules_Sup

MyGraph <- igraph::simplify(igraph::graph.data.frame(edgeList, directed=TRUE))

nodeList <- data.frame(ID = c(0:(igraph::vcount(MyGraph) - 1)), 
                       # because networkD3 library requires IDs to start at 0
                       nName = igraph::V(MyGraph)$name)

## Node Degree
nodeList <- cbind(nodeList, nodeDegree=igraph::degree(MyGraph, 
                    v = igraph::V(MyGraph), mode = "all"))

## Betweenness
BetweenNess <- igraph::betweenness(MyGraph, 
      v = igraph::V(MyGraph), 
      directed = TRUE) 

nodeList <- cbind(nodeList, nodeBetweenness=BetweenNess)

## This can change the BetweenNess value if needed

## Min-Max Normalization
##BetweenNess.norm <- (BetweenNess - min(BetweenNess))/(max(BetweenNess) - min(BetweenNess))

#################### BUILD THE EDGES #########################
getNodeID <- function(x){
  which(x == igraph::V(MyGraph)$name) - 1  #IDs start at 0
}

edgeList <- plyr::ddply(
  Rules_Sup, .variables = c("SourceName", "TargetName" , "Weight"), 
  function (x) data.frame(SourceID = getNodeID(x$SourceName), 
                          TargetID = getNodeID(x$TargetName)))
##############  Dice Sim #################
DiceSim <- igraph::similarity.dice(MyGraph, vids = igraph::V(MyGraph), mode = "all")

#Create  data frame that contains the Dice similarity between any two vertices
F1 <- function(x) {data.frame(diceSim = DiceSim[x$SourceID +1, x$TargetID + 1])}
#Place a new column in edgeList with the Dice Sim
edgeList <- plyr::ddply(edgeList,
                        .variables=c("SourceName", "TargetName", "Weight", 
                                               "SourceID", "TargetID"), 
                        function(x) data.frame(F1(x)))
##################   color #####################
COLOR_P <- colorRampPalette(c("#00FF00", "#FF0000"), 
                            bias = nrow(edgeList), space = "rgb", 
                            interpolate = "linear")
colCodes <- COLOR_P(length(unique(edgeList$diceSim)))
edges_col <- sapply(edgeList$diceSim, 
                    function(x) colCodes[which(sort(unique(edgeList$diceSim)) == x)])

## NetworkD3 Object
D3_network_Tweets <- networkD3::forceNetwork(
  Links = edgeList, # data frame that contains info about edges
  Nodes = nodeList, # data frame that contains info about nodes
  Source = "SourceID", # ID of source node 
  Target = "TargetID", # ID of target node
  Value = "Weight", # value from the edge list (data frame) that will be used to value/weight relationship amongst nodes
  NodeID = "nName", # value from the node list (data frame) that contains node description we want to use (e.g., node name)
  Nodesize = "nodeBetweenness",  # value from the node list (data frame) that contains value we want to use for a node size
  Group = "nodeDegree",  # value from the node list (data frame) that contains value we want to use for node color
  height = 700, # Size of the plot (vertical)
  width = 900,  # Size of the plot (horizontal)
  fontSize = 8, # Font size
  linkDistance = networkD3::JS("function(d) { return d.value*10; }"), # Function to determine distance between any two nodes, uses variables already defined in forceNetwork function (not variables from a data frame)
  linkWidth = networkD3::JS("function(d) { return d.value/10; }"),# Function to determine link/edge thickness, uses variables already defined in forceNetwork function (not variables from a data frame)
  opacity = 0.9, # opacity
  zoom = TRUE, # ability to zoom when click on the node
  opacityNoHover = 0.9, # opacity of labels when static
  linkColour = "red"   # edge colour (edges_col is computed above but not used here)
) 
# Plot network
D3_network_Tweets
# Save network as html file
networkD3::saveNetwork(D3_network_Tweets, 
                       "NetD3_Twitter_lift.html", selfcontained = TRUE)

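The commented-out min-max normalization in the code above divides by `max - min`, which is zero when every node has the same betweenness (for example, all zeros). A small guarded version might look like the sketch below; `min_max` and the sample scores are hypothetical.

```r
# Rescale a numeric vector to [0, 1]; returns zeros when all values are equal
min_max <- function(x) {
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) {
    return(rep(0, length(x)))  # avoid 0/0 when there is no spread
  }
  (x - rng[1]) / (rng[2] - rng[1])
}

betweenness_scores <- c(0, 2, 5, 10)  # hypothetical values
min_max(betweenness_scores)
```
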
igraph

library(igraph)

Support

TweetTrans_rules<-sort_rule_twitter_lift[1:150]

## Convert the RULES to a DATAFRAME
Rules_DF2<-DATAFRAME(TweetTrans_rules, separate = TRUE)

## Convert to char
Rules_DF2$LHS<-as.character(Rules_DF2$LHS)
Rules_DF2$RHS<-as.character(Rules_DF2$RHS)

## Remove all {}
Rules_DF2[] <- lapply(Rules_DF2, gsub, pattern='[{]', replacement='')
Rules_DF2[] <- lapply(Rules_DF2, gsub, pattern='[}]', replacement='')

## Use SUPPORT as the edge weight (columns selected by name, not position)
Rules_L<-Rules_DF2[c("LHS","RHS","support")]
names(Rules_L) <- c("SourceName", "TargetName", "Weight")

## Choose and set
Rules_Sup<-Rules_L

##################### BUILD THE NODES & EDGES #######################
edgeList<-Rules_Sup

MyGraph <- igraph::simplify(igraph::graph.data.frame(edgeList, directed=TRUE))

nodeList <- data.frame(ID = c(0:(igraph::vcount(MyGraph) - 1)), 
                       # because networkD3 library requires IDs to start at 0
                       nName = igraph::V(MyGraph)$name)

########################### igraph ######################################
# data preparation
Myedges <- edgeList
Mynodes <- nodeList
names(Myedges) <- c("from", "to", "weight")
names(Mynodes) <- c("id","label")
# Helper: replace node labels in the edge list with their integer node IDs
labelbyIndex <- function(Myedges, Mynodes) {
  # match() finds each label's row in the node list in one vectorized step
  Myedges$from <- as.integer(Mynodes$id[match(Myedges$from, Mynodes$label)])
  Myedges$to <- as.integer(Mynodes$id[match(Myedges$to, Mynodes$label)])
  # weights were turned into strings by gsub() above, so convert them back
  Myedges$weight <- as.numeric(Myedges$weight)
  # Myedges$weight <- as.integer(as.numeric(Myedges$weight)*100)
  return(Myedges)
}

Myedges <- labelbyIndex(Myedges,Mynodes)
My_igraph2 <- 
    igraph::graph_from_data_frame(d = Myedges, vertices = Mynodes, directed = TRUE)

# igraph::E(My_igraph2)
# igraph::E(My_igraph2)$weight
igraph::V(My_igraph2)$size = 10

E_Weight<-Myedges$weight
E(My_igraph2)$weight <- igraph::edge.betweenness(My_igraph2)
E(My_igraph2)$color <- "purple"

layout1 <- layout.fruchterman.reingold(My_igraph2)

## Interactive plot with tkplot; use plot(My_igraph2) for a static version
# plot(My_igraph2)
tkplot(My_igraph2, edge.arrow.size = 0.3,
     # vertex.size=E_Weight*5, 
     vertex.color="lightblue",
     layout=layout1,
     vertex.label.cex=0.8,
     vertex.label.dist=2,
     edge.curved=0.2,
     vertex.label.color="black",
     edge.weight=5, 
     edge.width=E(My_igraph2)$weight,
     #edge_density(My_igraph2)
     ## Affect edge lengths
     rescale = FALSE, 
     ylim=c(0,14),
     xlim=c(0,20)
     )
## [1] 1
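
The same preparation steps (strip braces, pick a measure, build nodes and edges) recur for every plot in this section. One way to avoid the repetition is a single helper, sketched below under stated assumptions: `rules_df` is a hand-made stand-in for the output of DATAFRAME(rules, separate = TRUE), and `rules_to_network()` is a hypothetical name, not a function from arules or igraph.

```r
# Hypothetical stand-in for the output of DATAFRAME(rules, separate = TRUE)
rules_df <- data.frame(
  LHS = c("{airdrop,nft}", "{bnb}", "{nft}"),
  RHS = c("{metaverse}", "{binance}", "{metaverse}"),
  support = c(0.12, 0.30, 0.25),
  confidence = c(0.90, 0.75, 0.60),
  lift = c(2.1, 1.4, 1.2),
  stringsAsFactors = FALSE
)

# Build a zero-indexed node list and edge list for one quality measure.
# The function name and arguments are illustrative only.
rules_to_network <- function(df, measure = c("support", "confidence", "lift")) {
  measure <- match.arg(measure)
  strip <- function(x) gsub("[{}]", "", as.character(x))
  from <- strip(df$LHS)
  to <- strip(df$RHS)
  labels <- unique(c(from, to))
  nodes <- data.frame(id = seq_along(labels) - 1L, label = labels,
                      stringsAsFactors = FALSE)
  edges <- data.frame(
    from = nodes$id[match(from, labels)],
    to = nodes$id[match(to, labels)],
    weight = as.numeric(df[[measure]])
  )
  list(nodes = nodes, edges = edges)
}

net <- rules_to_network(rules_df, "lift")
net$edges
```

Selecting the measure by column name keeps the helper independent of the column order in the rules data frame.
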

Confidence

TweetTrans_rules<-sort_rule_twitter_con[1:100]

## Convert the RULES to a DATAFRAME
Rules_DF2<-DATAFRAME(TweetTrans_rules, separate = TRUE)

## Convert to char
Rules_DF2$LHS<-as.character(Rules_DF2$LHS)
Rules_DF2$RHS<-as.character(Rules_DF2$RHS)

## Remove all {}
Rules_DF2[] <- lapply(Rules_DF2, gsub, pattern='[{]', replacement='')
Rules_DF2[] <- lapply(Rules_DF2, gsub, pattern='[}]', replacement='')

## Use CONFIDENCE as the edge weight (columns selected by name, not position)
Rules_L<-Rules_DF2[c("LHS","RHS","confidence")]
names(Rules_L) <- c("SourceName", "TargetName", "Weight")

## Choose and set
Rules_Sup<-Rules_L

##################### BUILD THE NODES & EDGES #######################
edgeList<-Rules_Sup

MyGraph <- igraph::simplify(igraph::graph.data.frame(edgeList, directed=TRUE))

nodeList <- data.frame(ID = c(0:(igraph::vcount(MyGraph) - 1)), 
                       # because networkD3 library requires IDs to start at 0
                       nName = igraph::V(MyGraph)$name)

########################### igraph ######################################
# data preparation
Myedges <- edgeList
Mynodes <- nodeList
names(Myedges) <- c("from", "to", "weight")
names(Mynodes) <- c("id","label")
Myedges <- labelbyIndex(Myedges,Mynodes)


My_igraph2 <- 
    igraph::graph_from_data_frame(d = Myedges, vertices = Mynodes, directed = TRUE)

# igraph::E(My_igraph2)
# igraph::E(My_igraph2)$weight
igraph::V(My_igraph2)$size = 10

E_Weight<-Myedges$weight
E(My_igraph2)$weight <- igraph::edge.betweenness(My_igraph2)
E(My_igraph2)$color <- "purple"

layout1 <- layout.fruchterman.reingold(My_igraph2)


## Interactive plot with tkplot; use plot(My_igraph2) for a static version
# plot(My_igraph2)
tkplot(My_igraph2, edge.arrow.size = 0.3,
     # vertex.size=E_Weight*5, 
     vertex.color="lightblue",
     layout=layout1,
     vertex.label.cex=0.8,
     vertex.label.dist=2,
     edge.curved=0.2,
     vertex.label.color="black",
     edge.weight=5, 
     edge.width=E(My_igraph2)$weight,
     #edge_density(My_igraph2)
     ## Affect edge lengths
     rescale = FALSE, 
     ylim=c(0,14),
     xlim=c(0,20)
     )
## [1] 2
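
The plots above pass edge betweenness straight to edge.width. Betweenness can become large when many rules are drawn, so one option is to rescale it to a fixed range of line widths first. The sketch below uses base R only; `rescale_width()` and its default range are hypothetical choices, and the betweenness scores are made up.

```r
# Map a numeric vector onto a chosen range (here line widths from 0.5 to 5).
# The function name and default range are illustrative choices.
rescale_width <- function(x, to = c(0.5, 5)) {
  rng <- range(x, na.rm = TRUE)
  # If all values are equal, fall back to the midpoint of the target range
  if (rng[1] == rng[2]) return(rep(mean(to), length(x)))
  (x - rng[1]) / (rng[2] - rng[1]) * (to[2] - to[1]) + to[1]
}

# Hypothetical betweenness scores for five edges
btw <- c(1, 4, 10, 25, 40)
widths <- rescale_width(btw)
widths
```

In the plotting call this could be used as edge.width = rescale_width(E(My_igraph2)$weight).
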

Lift

TweetTrans_rules<-sort_rule_twitter_lift[1:150]

## Convert the RULES to a DATAFRAME
Rules_DF2<-DATAFRAME(TweetTrans_rules, separate = TRUE)

## Convert to char
Rules_DF2$LHS<-as.character(Rules_DF2$LHS)
Rules_DF2$RHS<-as.character(Rules_DF2$RHS)

## Remove all {}
Rules_DF2[] <- lapply(Rules_DF2, gsub, pattern='[{]', replacement='')
Rules_DF2[] <- lapply(Rules_DF2, gsub, pattern='[}]', replacement='')

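## Use LIFT as the edge weight (selected by name, since the position of the
## lift column differs between arules versions)
Rules_L<-Rules_DF2[c("LHS","RHS","lift")]
names(Rules_L) <- c("SourceName", "TargetName", "Weight")

## Choose and set
Rules_Sup<-Rules_L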

##################### BUILD THE NODES & EDGES #######################
edgeList<-Rules_Sup

MyGraph <- igraph::simplify(igraph::graph.data.frame(edgeList, directed=TRUE))

nodeList <- data.frame(ID = c(0:(igraph::vcount(MyGraph) - 1)), 
                       # because networkD3 library requires IDs to start at 0
                       nName = igraph::V(MyGraph)$name)

########################### igraph ######################################
# data preparation
Myedges <- edgeList
Mynodes <- nodeList
names(Myedges) <- c("from", "to", "weight")
names(Mynodes) <- c("id","label")
Myedges <- labelbyIndex(Myedges,Mynodes)

My_igraph2 <- 
    igraph::graph_from_data_frame(d = Myedges, vertices = Mynodes, directed = TRUE)

# igraph::E(My_igraph2)
# igraph::E(My_igraph2)$weight
igraph::V(My_igraph2)$size = 10

E_Weight<-Myedges$weight
E(My_igraph2)$weight <- igraph::edge.betweenness(My_igraph2)
E(My_igraph2)$color <- "purple"

layout1 <- layout.fruchterman.reingold(My_igraph2)

tkplot(My_igraph2, edge.arrow.size = 0.3,
     # vertex.size=E_Weight*5, 
     vertex.color="lightblue",
     layout=layout1,
     vertex.label.cex=0.8,
     vertex.label.dist=2,
     edge.curved=0.2,
     vertex.label.color="black",
     edge.weight=5, 
     edge.width=E(My_igraph2)$weight,
     #edge_density(My_igraph2)
     ## Affect edge lengths
     rescale = FALSE, 
     ylim=c(0,14),
     xlim=c(0,20)
     )
## [1] 3

visNetwork

require(visNetwork, quietly = TRUE)

Support

TweetTrans_rules<-sort_rule_twitter_sup[1:100]

## Convert the RULES to a DATAFRAME
Rules_DF2<-DATAFRAME(TweetTrans_rules, separate = TRUE)

## Convert to char
Rules_DF2$LHS<-as.character(Rules_DF2$LHS)
Rules_DF2$RHS<-as.character(Rules_DF2$RHS)

## Remove all {}
Rules_DF2[] <- lapply(Rules_DF2, gsub, pattern='[{]', replacement='')
Rules_DF2[] <- lapply(Rules_DF2, gsub, pattern='[}]', replacement='')

## Use SUPPORT as the edge weight (columns selected by name, not position)
Rules_L<-Rules_DF2[c("LHS","RHS","support")]
names(Rules_L) <- c("SourceName", "TargetName", "Weight")

## Choose and set
Rules_Sup<-Rules_L

##################### BUILD THE NODES & EDGES #######################
edgeList<-Rules_Sup

MyGraph <- igraph::simplify(igraph::graph.data.frame(edgeList, directed=TRUE))

nodeList <- data.frame(ID = c(0:(igraph::vcount(MyGraph) - 1)), 
                       # because networkD3 library requires IDs to start at 0
                       nName = igraph::V(MyGraph)$name)

########################### visNetwork ######################################
# data preparation
Myedges <- edgeList
Mynodes <- nodeList
names(Myedges) <- c("from", "to", "weight")
names(Mynodes) <- c("id","label")
Myedges <- labelbyIndex(Myedges,Mynodes)

Myedges <- Myedges[,c(1,2)]

(graph <- visNetwork::visNetwork(Mynodes, Myedges, width = "100%"))# %>%
  # visOptions(highlightNearest = TRUE, nodesIdSelection = TRUE)

visSave(graph, "visNetwork_support.html", selfcontained = TRUE, background = "white")
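
The edge list above keeps only the from and to columns, so the support values do not appear in the plot. visNetwork reads edge thickness from a column named `value` and hover text from a column named `title`, so the weight can be carried over. A minimal sketch on made-up tables, assuming the same id/label and from/to/weight layout as above:

```r
# Toy node and edge tables in the shape produced by labelbyIndex() (hypothetical values)
nodes <- data.frame(id = 0:2, label = c("cryptocurrency", "nft", "airdrop"))
edges <- data.frame(from = c(1, 2), to = c(0, 0), weight = c(0.31, 0.18))

# `value` controls edge thickness and `title` the tooltip shown on hover
edges$value <- edges$weight
edges$title <- paste0("support: ", format(edges$weight, nsmall = 2))

# Draw the graph only if visNetwork is installed
if (requireNamespace("visNetwork", quietly = TRUE)) {
  graph_weighted <- visNetwork::visNetwork(nodes, edges, width = "100%")
  graph_weighted <- visNetwork::visEdges(graph_weighted, arrows = "to")
}
```
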

Confidence

TweetTrans_rules<-sort_rule_twitter_con[1:100]

## Convert the RULES to a DATAFRAME
Rules_DF2<-DATAFRAME(TweetTrans_rules, separate = TRUE)

## Convert to char
Rules_DF2$LHS<-as.character(Rules_DF2$LHS)
Rules_DF2$RHS<-as.character(Rules_DF2$RHS)

## Remove all {}
Rules_DF2[] <- lapply(Rules_DF2, gsub, pattern='[{]', replacement='')
Rules_DF2[] <- lapply(Rules_DF2, gsub, pattern='[}]', replacement='')

## Use CONFIDENCE as the edge weight (columns selected by name, not position)
Rules_L<-Rules_DF2[c("LHS","RHS","confidence")]
names(Rules_L) <- c("SourceName", "TargetName", "Weight")

## Choose and set
Rules_Sup<-Rules_L

##################### BUILD THE NODES & EDGES #######################
edgeList<-Rules_Sup

MyGraph <- igraph::simplify(igraph::graph.data.frame(edgeList, directed=TRUE))

nodeList <- data.frame(ID = c(0:(igraph::vcount(MyGraph) - 1)), 
                       # because networkD3 library requires IDs to start at 0
                       nName = igraph::V(MyGraph)$name)

########################### visNetwork ######################################
# data preparation
Myedges <- edgeList
Mynodes <- nodeList
names(Myedges) <- c("from", "to", "weight")
names(Mynodes) <- c("id","label")
Myedges <- labelbyIndex(Myedges,Mynodes)

Myedges <- Myedges[,c(1,2)]

(graph <- visNetwork::visNetwork(Mynodes, Myedges, width = "100%")%>%
  visOptions(highlightNearest = TRUE, nodesIdSelection = TRUE))# %>%
  # visEdges(arrows = 'from')

visSave(graph, "visNetwork_confidence.html", selfcontained = TRUE, background = "white")

Lift

TweetTrans_rules<-sort_rule_twitter_lift[1:100]

## Convert the RULES to a DATAFRAME
Rules_DF2<-DATAFRAME(TweetTrans_rules, separate = TRUE)

## Convert to char
Rules_DF2$LHS<-as.character(Rules_DF2$LHS)
Rules_DF2$RHS<-as.character(Rules_DF2$RHS)

## Remove all {}
Rules_DF2[] <- lapply(Rules_DF2, gsub, pattern='[{]', replacement='')
Rules_DF2[] <- lapply(Rules_DF2, gsub, pattern='[}]', replacement='')

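## Use LIFT as the edge weight (selected by name, since the position of the
## lift column differs between arules versions)
Rules_L<-Rules_DF2[c("LHS","RHS","lift")]
names(Rules_L) <- c("SourceName", "TargetName", "Weight")

## Choose and set
Rules_Sup<-Rules_L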

##################### BUILD THE NODES & EDGES #######################
edgeList<-Rules_Sup

MyGraph <- igraph::simplify(igraph::graph.data.frame(edgeList, directed=TRUE))

nodeList <- data.frame(ID = c(0:(igraph::vcount(MyGraph) - 1)), 
                       # because networkD3 library requires IDs to start at 0
                       nName = igraph::V(MyGraph)$name)

########################### visNetwork ######################################
# data preparation
Myedges <- edgeList
Mynodes <- nodeList
names(Myedges) <- c("from", "to", "weight")
names(Mynodes) <- c("id","label")
Myedges <- labelbyIndex(Myedges,Mynodes)

Myedges <- Myedges[,c(1,2)]

graph <- visNetwork::visNetwork(Mynodes, Myedges, width = "100%") # %>%
  # visOptions(highlightNearest = TRUE, nodesIdSelection = TRUE)%>% 
  # visHierarchicalLayout() 

visSave(graph, "visNetwork_lift.html", selfcontained = TRUE, background = "white")

Discussion for professionals

In the support-sorted plot, “cryptocurrency” sits at the center of the tweet network, and two keywords, “http” and “numbr”, point to an interesting fact. In the early days of the crypto world, cryptocurrency grew out of the web, and people liked to discuss numerical indicators such as the price of Bitcoin, computing power, and even the number of blocks. After several years of development, the ideal of a decentralized crypto world spread around the globe. As more people joined, derivatives of cryptocurrency were invented, such as DeFi, NFTs, and private chains (BSC), along with associated activities and platforms such as airdrop sessions and formal exchanges (Binance). People now focus more on the derivatives of Bitcoin than on its underlying technology.

Graph and discussion for common people

networkD3

library(networkD3)

Support

TweetTrans_rules<-sort_rule_twitter_sup[1:100]

## Convert the RULES to a DATAFRAME
Rules_DF2<-DATAFRAME(TweetTrans_rules, separate = TRUE)

## Convert to char
Rules_DF2$LHS<-as.character(Rules_DF2$LHS)
Rules_DF2$RHS<-as.character(Rules_DF2$RHS)

## Remove all {}
Rules_DF2[] <- lapply(Rules_DF2, gsub, pattern='[{]', replacement='')
Rules_DF2[] <- lapply(Rules_DF2, gsub, pattern='[}]', replacement='')

## Use SUPPORT as the edge weight (columns selected by name, not position)
Rules_L<-Rules_DF2[c("LHS","RHS","support")]
names(Rules_L) <- c("SourceName", "TargetName", "Weight")

## Choose and set
Rules_Sup<-Rules_L

##################### BUILD THE NODES & EDGES #######################
edgeList<-Rules_Sup

MyGraph <- igraph::simplify(igraph::graph.data.frame(edgeList, directed=TRUE))

nodeList <- data.frame(ID = c(0:(igraph::vcount(MyGraph) - 1)), 
                       # because networkD3 library requires IDs to start at 0
                       nName = igraph::V(MyGraph)$name)

########################### networkD3 ######################################
# data preparation
Myedges <- edgeList
Mynodes <- nodeList
names(Myedges) <- c("from", "to", "weight")
names(Mynodes) <- c("id","label")
Myedges <- labelbyIndex(Myedges,Mynodes)

(network <- networkD3::forceNetwork(Links = Myedges, Nodes = Mynodes, Source = "from",
             Target = "to", Value = "weight", NodeID = "label",
             Group = "id", opacity = 0.9,zoom = TRUE,opacityNoHover = 0.9))
saveNetwork(network, "networkd3_support.html", selfcontained = TRUE)

Discussion

The interactive plot above shows the associations between keywords. Each node represents a keyword set (one or two keywords), and each edge represents a rule linking one keyword set to another. For example, “cryptocurrency” is linked to almost every other node, which means that it co-occurs with almost every other keyword in the tweets. By contrast, “https,numbr” is linked only to “cryptocurrency”: among the top 100 rules by support, it appears in a rule only together with “cryptocurrency”.
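
The reading of the plot described above (a hub linked to almost everything, and a node with a single link) can also be checked numerically by counting edges per node. A minimal base R sketch on a made-up edge list; the rows are illustrative and not taken from the mined rules:

```r
# Toy edge list: each row is one rule, LHS keyword(s) -> RHS keyword (hypothetical data)
edges <- data.frame(
  from = c("nft", "airdrop", "https,numbr", "bnb", "nft"),
  to   = c("cryptocurrency", "cryptocurrency", "cryptocurrency",
           "cryptocurrency", "airdrop"),
  stringsAsFactors = FALSE
)

# Degree = number of edges touching a node, counting both directions
degree <- sort(table(c(edges$from, edges$to)), decreasing = TRUE)
degree

# Neighbours of one node: every keyword set that shares an edge with it
neighbours <- function(node) {
  unique(c(edges$to[edges$from == node], edges$from[edges$to == node]))
}
neighbours("https,numbr")
```

The node with the highest degree is the hub of the network, and a node whose only neighbour is the hub corresponds to the “https,numbr” case.
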

Confidence

TweetTrans_rules<-sort_rule_twitter_con[1:100]

## Convert the RULES to a DATAFRAME
Rules_DF2<-DATAFRAME(TweetTrans_rules, separate = TRUE)

## Convert to char
Rules_DF2$LHS<-as.character(Rules_DF2$LHS)
Rules_DF2$RHS<-as.character(Rules_DF2$RHS)

## Remove all {}
Rules_DF2[] <- lapply(Rules_DF2, gsub, pattern='[{]', replacement='')
Rules_DF2[] <- lapply(Rules_DF2, gsub, pattern='[}]', replacement='')

## Use CONFIDENCE as the edge weight (columns selected by name, not position)
Rules_L<-Rules_DF2[c("LHS","RHS","confidence")]
names(Rules_L) <- c("SourceName", "TargetName", "Weight")

## Choose and set
Rules_Sup<-Rules_L

##################### BUILD THE NODES & EDGES #######################
edgeList<-Rules_Sup

MyGraph <- igraph::simplify(igraph::graph.data.frame(edgeList, directed=TRUE))

nodeList <- data.frame(ID = c(0:(igraph::vcount(MyGraph) - 1)), 
                       # because networkD3 library requires IDs to start at 0
                       nName = igraph::V(MyGraph)$name)

########################### networkD3 ######################################
# data preparation
Myedges <- edgeList
Mynodes <- nodeList
names(Myedges) <- c("from", "to", "weight")
names(Mynodes) <- c("id","label")
Myedges <- labelbyIndex(Myedges,Mynodes)

(network <- networkD3::forceNetwork(Links = Myedges, Nodes = Mynodes, Source = "from",
             Target = "to", Value = "weight", NodeID = "label",
             Group = "id", opacity = 0.9,zoom = TRUE,opacityNoHover = 0.9))
saveNetwork(network, "networkd3_confidence.html", selfcontained = TRUE)
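
Because forceNetwork expects Source and Target to be zero-based row indices into the node table, a quick sanity check before plotting can catch an edge list that was not converted. The helper below is a sketch; `links_are_zero_based()` is a hypothetical name, not part of networkD3, and the tables are made up.

```r
# Toy node and edge tables (hypothetical values)
nodes <- data.frame(id = 0:2, label = c("cryptocurrency", "nft", "airdrop"))
edges <- data.frame(from = c(1L, 2L), to = c(0L, 0L), weight = c(0.9, 0.8))

# TRUE when every link endpoint is a valid zero-based row index into `nodes`
links_are_zero_based <- function(edges, nodes) {
  ids <- c(edges$from, edges$to)
  !anyNA(ids) && all(ids >= 0) && all(ids <= nrow(nodes) - 1)
}

links_are_zero_based(edges, nodes)

# An edge list that still uses one-based indices fails the check
bad_edges <- transform(edges, from = from + 1L, to = to + 1L)
links_are_zero_based(bad_edges, nodes)
```
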

Lift

TweetTrans_rules<-sort_rule_twitter_lift[1:100]

## Convert the RULES to a DATAFRAME
Rules_DF2<-DATAFRAME(TweetTrans_rules, separate = TRUE)

## Convert to char
Rules_DF2$LHS<-as.character(Rules_DF2$LHS)
Rules_DF2$RHS<-as.character(Rules_DF2$RHS)

## Remove all {}
Rules_DF2[] <- lapply(Rules_DF2, gsub, pattern='[{]', replacement='')
Rules_DF2[] <- lapply(Rules_DF2, gsub, pattern='[}]', replacement='')

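## Use LIFT as the edge weight (selected by name, since the position of the
## lift column differs between arules versions)
Rules_L<-Rules_DF2[c("LHS","RHS","lift")]
names(Rules_L) <- c("SourceName", "TargetName", "Weight")

## Choose and set
Rules_Sup<-Rules_L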

##################### BUILD THE NODES & EDGES #######################
edgeList<-Rules_Sup

MyGraph <- igraph::simplify(igraph::graph.data.frame(edgeList, directed=TRUE))

nodeList <- data.frame(ID = c(0:(igraph::vcount(MyGraph) - 1)), 
                       # because networkD3 library requires IDs to start at 0
                       nName = igraph::V(MyGraph)$name)

########################### networkD3 ######################################
# data preparation
Myedges <- edgeList
Mynodes <- nodeList
names(Myedges) <- c("from", "to", "weight")
names(Mynodes) <- c("id","label")
Myedges <- labelbyIndex(Myedges,Mynodes)

(network <- networkD3::forceNetwork(Links = Myedges, Nodes = Mynodes, Source = "from",
             Target = "to", Value = "weight", NodeID = "label",
             Group = "id", opacity = 0.9,zoom = TRUE,opacityNoHover = 0.9))
saveNetwork(network, "networkd3_lift.html", selfcontained = TRUE)

Discussion

Lift measures how much more often two keyword sets co-occur than would be expected if they were independent; a lift above 1 indicates a positive association. In the interactive plot above, for example, “metaverse” is linked to a series of keyword sets such as “airdrop,nft”, “airdrop,pancakeswap”, and “binance”. These keyword links are a good way for people to understand what an NFT is.

For instance, this link map gives a deeper and more complete view of what an “NFT” is. Start with the central word “Metaverse”, which here is simply a company that focuses on making and selling “NFT”s. Next come its linked words, such as “airdrop”, which stands for a series of activities that attract people to take part in sessions and so raise public interest. Then come the keywords “binance” and “BSC”: Binance is a cryptocurrency exchange, and “BSC” is its private chain, which supports many airdrop activities. Finally, some keywords resemble “playtoearn”. In fact, to attract more people to “NFT” activities, many platforms offer small games to keep participants engaged.
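
The three measures used to rank the rules in this section are related by simple ratios. The sketch below works them out from made-up counts; the numbers are hypothetical and only illustrate the arithmetic.

```r
# Hypothetical counts from a collection of 1000 tweets
n_total <- 1000
n_lhs   <- 100   # tweets containing the left-hand side, e.g. "airdrop"
n_rhs   <- 200   # tweets containing the right-hand side, e.g. "nft"
n_both  <- 80    # tweets containing both

support    <- n_both / n_total                 # P(LHS and RHS)
confidence <- n_both / n_lhs                   # P(RHS | LHS)
lift       <- confidence / (n_rhs / n_total)   # P(RHS | LHS) / P(RHS)

c(support = support, confidence = confidence, lift = lift)
```

With these counts the lift is 4: tweets containing the left-hand side are four times as likely to contain the right-hand side as tweets in general.
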

In summary, for professionals, these plots and ARM help uncover hidden but useful information. For a general audience, these interactive plots are also a good way to learn what is happening in a given field.